SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

These  studies  are  directed  toward  evaluating  the  prognostic  power  of  the  electrocardiogram,  when  analyzed  by  advanced 
computer  methodology,  and  the  predictive  accuracy  of  diagnostic  criteria,  when  implemented  in  ECG  computer  programs. 
Appropriate  use  of  digital  signal  processing  in  electrocardiography  requires  application  of  statistically-based  techniques  of 
information  theory  and  mathematically-based  engineering  methods,  as  well  as  knowledge  of  its  clinical  relevance. 

Additional  studies  are  directed  toward  the  analysis  of  heart  rate,  blood  pressure  and  respiratory  signals  that  affect  syncopal 
patients  during  table-tilt  testing,  using  autoregressive  models  and  the  corresponding  power  spectra.  Syncope  can  be 
disabling  for  patients  and,  at  times,  life  threatening.  An  understanding  of  the  autonomic  nervous  system  mechanisms 
responsible  for  syncope  may  indicate  appropriate  therapy. 
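
As a rough illustration of the autoregressive approach mentioned above, the following Python sketch fits an AR model to an evenly sampled series with the Yule-Walker equations (solved by the Levinson-Durbin recursion) and evaluates the corresponding power spectrum. The function names, the model order, and the synthetic 0.1 Hz test signal are assumptions made for this example; they are not taken from the project's software.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Biased sample autocovariance of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[i] * d[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for x[t] = sum_k a[k-1]*x[t-k] + e[t].

    Returns the AR coefficients and the variance of the driving noise e.
    """
    a, err = [], r[0]
    for m in range(order):
        k = (r[m + 1] - sum(a[j] * r[m - j] for j in range(m))) / err
        a = [a[j] - k * a[m - 1 - j] for j in range(m)] + [k]
        err *= 1.0 - k * k
    return a, err


def ar_power_spectrum(a, noise_var, freqs, dt=1.0):
    """AR power spectral density at the given frequencies (Hz if dt is in s)."""
    psd = []
    for f in freqs:
        re, im = 1.0, 0.0
        for k, ak in enumerate(a, start=1):
            angle = 2.0 * math.pi * f * k * dt
            re -= ak * math.cos(angle)
            im += ak * math.sin(angle)
        psd.append(noise_var * dt / (re * re + im * im))
    return psd


def dominant_frequency(x, order, dt=1.0, n_grid=500):
    """Frequency at which the fitted AR spectrum is largest."""
    r = autocorrelation(x, order)
    a, noise_var = levinson_durbin(r, order)
    nyquist = 0.5 / dt
    freqs = [nyquist * i / n_grid for i in range(1, n_grid)]
    psd = ar_power_spectrum(a, noise_var, freqs, dt)
    return freqs[psd.index(max(psd))]


def synthetic_series(n=512, freq=0.1, noise=0.1, seed=0):
    """Hypothetical test signal: one oscillation plus white noise, 1 sample/s."""
    rng = random.Random(seed)
    return [math.sin(2.0 * math.pi * freq * t) + noise * rng.gauss(0.0, 1.0)
            for t in range(n)]
```

Calling `dominant_frequency(synthetic_series(), order=2)` should return a value close to the 0.1 Hz oscillation built into the test signal. Real heart rate or blood pressure series would first need resampling to a uniform time base and would normally call for a higher model order.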

These studies have been re-directed toward the analysis of ambulatory electrocardiograms (AECGs). Despite extensive
literature showing that information extracted by computer analysis of AECGs can be related to cardiac risk factors, there are
no  standard  methods  for  the  routine  analysis  of  AECGs  in  this  rapidly  evolving  field.  The  objective  of  this  research  is  to 
carry  forward  previous  work  in  biosignal  analysis  and  to  adapt  methodologies,  with  the  goal  of  implementing  as  much 
automation  as  possible  to  enable  and  expedite  the  interpretation  of  the  huge  streams  of  AECG  data. 


DEPARTMENT  OF  HEALTH  AND  HUMAN  SERVICES  -  PUBLIC  HEALTH  SERVICE 
NOTICE    OF   INTRAMURAL    RESEARCH    PROJECT 


PROJECT  NUMBER 

Z01  CT0010-20  PSL 


PERIOD  COVERED 

October  1,  1993  to  September  30,  1994 


TITLE  OF  PROJECT      (80  characters  or  less.  Title  must  fit  on  one  line  between  the  borders.) 

Mathematical  and  Computational  Methods  for  Solving  Nonlinear  Equations 


PRINCIPAL  INVESTIGATOR  (List  other  professional  personnel  below  the  Principal  Investigator.)    (Name,  title,  laboratory,  and  institute  affiliation) 

PI:        R.I. Shrager  Research Mathematician  DCRT/PSL

G.H.  Weiss,  Ph.D.  Chief,  PSL  DCRT/PSL 

P.J.  Munson,  Ph.D.  Section  Chief  DCRT/LSB 

others:     M.S.  Lewis,  Ph.D.  NCRR/BEIP 

S-J.  Kim,  Ph.D.  NCI/DCBDC 

R.  Berger,  Ph.D.  Section  Chief  NHLBI/LCB 

R.  Hendler,  Ph.D.  Section  Chief  NHLBI/LCB 

R.  Carson,  Ph.D.  Research  Mathematician  CC/NMD 


COOPERATING UNITS (if any)

University  of  Milan,  Italy  (G.E.  Rovati,  Ph.D.);  J.  Nehru  University,  New  Delhi,  India  (S.  Bose,  Ph.D.);  Washington 
University  School  of  Medicine,  St.  Louis  (D.W.  Myers,  Ph.D.,  G.K.  Ackers,  Ph.D.);  SmithKline  Beecham 
Pharmaceuticals,  King  of  Prussia,  PA  (M.L.  Doyle,  Ph.D.);  University  of  California,  San  Diego  (K.D.  Vandegriff, 
Ph.D.);  Tel-Aviv  University,  Israel  (U.  Shmueli,  Ph.D.,  R.  Schach,  Ph.D.,  I.  Goldberg,  Ph.D.). 


LAB/BRANCH  Physical  Sciences  Laboratory 


INSTITUTE AND LOCATION      Division of Computer Research & Technology, Bldg. 12A, Room 2007, Bethesda, MD 20892


TOTAL  MAN-YEARS:  5.0 


PROFESSIONAL:  5.0 


CHECK  APPROPRIATE  BOX(ES) 

□ (a) Human subjects   □ (b) Human tissues   ■ (c) Neither

□ (a1) Minors
□ (a2) Interviews


SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

This  project  helps  investigators  cope  with  complex  equations  that  model  biological  systems,  and  includes  the 
following  studies: 

1)  Ultracentrifuge  (with  M.S.  Lewis,  S-J.  Kim):  DNA-protein  interactions  are  analyzed  using  pseudo-inverse 
matrices. 

2)  Hemoglobins  (with  K.D.  Vandegriff,  R.M.  Winslow,  V.W.  MacDonald,  M.L.  Doyle):  oxygenation  and  oxidation 
of  hemoglobins  are  studied  by  spectrophotometry  and  singular  value  decomposition  (SVD). 

3)  X-ray  crystallography  (with  U.  Shmueli,  R.  Schach,  G.H.  Weiss):  methods  were  developed  for  rapid 
computation  of  the  probability  density  function  used  in  phase  determination,  and  for  improved  estimation  of 
background  radiation  in  X-ray  diffraction. 

4) Imaging regional cerebral blood flow (with R.E. Carson): a method that does not require explicit (and
invasive)  measurement  of  arterial  flow  was  programmed  and  tested. 

5) Kinetics of Bacteriorhodopsin (with R.W. Hendler, S. Bose): several models of light-intensity dependence
are  being  tested. 

6)  Kinetics  of  cytochrome  aa3  (with  R.W.  Hendler,  S.  Bose):  spectrophotometric  studies  involving  SVD, 
pseudo-inverses,  and  other  methods,  are  in  progress. 

7)  Protein-ligand  binding  (with  P.J.  Munson,  G.E.  Rovati):  a  program  for  nonlinear  least  squares  fitting  of 
binding data is under development.
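
Several of the studies listed above rest on linear least squares via the pseudo-inverse. The Python sketch below is a minimal illustration, assuming a measured spectrum that is a mixture of two known component spectra; it recovers the two concentrations with the pseudo-inverse formed from the normal equations. The function names and the numerical values are hypothetical and are not drawn from the project's data or programs.

```python
def pinv_two_columns(a):
    """Pseudo-inverse of an m x 2 matrix with independent columns.

    Uses the identity A+ = (A^T A)^-1 A^T, which holds when A has full
    column rank. The result is returned as a 2 x m list of rows.
    """
    s11 = sum(row[0] * row[0] for row in a)
    s12 = sum(row[0] * row[1] for row in a)
    s22 = sum(row[1] * row[1] for row in a)
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("columns are (nearly) linearly dependent")
    inv = [[s22 / det, -s12 / det],
           [-s12 / det, s11 / det]]
    return [[inv[i][0] * row[0] + inv[i][1] * row[1] for row in a]
            for i in range(2)]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


# Hypothetical component spectra (columns) sampled at four wavelengths.
components = [[1.0, 0.0],
              [0.5, 0.3],
              [0.2, 0.6],
              [0.0, 1.0]]

# Mixture built from concentrations 2 and 3 of the two components.
mixture = [2.0, 1.9, 2.2, 3.0]

estimated = mat_vec(pinv_two_columns(components), mixture)
```

Here `estimated` comes out close to `[2.0, 3.0]`, the concentrations used to build the mixture. For larger or ill-conditioned problems, a pseudo-inverse computed from the singular value decomposition is generally preferred to the normal equations used in this sketch.
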
This project has several components related to different biomedical instrumentation modalities. One type of
study is to find optimal designs for measuring in vivo first-order rate constants by means of NMR magnetization
transfer  experiments.  It  is  important  to  make  these  measurements  as  quickly  as  possible  to  minimize  artefacts 
due  to  physiological  changes  that  might  occur  during  the  course  of  the  experiment.  In  the  course  of  a  project 
currently being completed, an easily implemented optimal experiment was designed under the assumption that
the experimenter knows not only an a priori range for the rate constant but also the associated spin-lattice relaxation
time (T1). The somewhat artificial assumption of a known T1 is dropped in the current approach to this problem.

A  project  related  to  many  aspects  of  medical  imaging  has  required  the  development  of  a  simulation  package  to 
examine  problems  raised  by  positron-emission  tomography  (PET)  and  single-photon  emission  tomography 
(SPECT).  For  this  purpose,  a  currently  available  program  (SIMSET,  developed  at  the  University  of  Washington) 
has  been  modified  to  more  accurately  model  the  design  of  equipment  in  general  hospital  use.  Several  projects 
using  this  program  will  be  undertaken  in  the  coming  year. 

A monograph written by U. Shmueli and G.H. Weiss, "An Introduction to Crystallographic Statistics," will be
published  by  Oxford  University  Press  in  the  forthcoming  year. 
Quantitative physical and mathematical methods have been applied to several research problems in cell
biophysics  and  tissue  optics.  In  cell  biophysics,  recent  emphasis  has  been  on  determining  the  mechanical  and 
structural properties of large (mesoscopic) molecular structures. Particular attention is being given to the lattice
rearrangements  that  occur  when  a  network  of  clathrin  triskelions  initially  located  on  a  cell  surface  (a  "coated  pit") 
buds  off  to  form  a  basket  ("coated  vesicle").  We  developed  a  set  of  novel  analytical  and  computational  tools  to 
relate  the  shape  variations  of  triskelions  to  the  underlying  mechanical  properties  of  the  molecules.  These 
methods  are  being  used  to  obtain  from  electron  micrographs  quantitative  information  regarding  the  flexibility  of 
the  triskelion  arms  and  the  mechanical  properties  of  the  central  hub  where  the  arms  are  joined.  The  mesoscopic 
structure of macromolecular complexes is also being probed by diffraction measurements utilizing neutrons or
light.  During  the  past  year,  we  continued  our  studies  of  agarose  gels,  which  serve  as  models  for  various 
biopolymer  matrices.  Recent  emphasis  has  been  on  understanding  how  solution  properties  affect  network 
junctions, and how gel structure is changed by applied electric fields. Electric field effects are only weakly
apparent  on  the  length  scales  probed  by  neutrons,  and  to  extend  the  range  of  observation,  a  collaborative 
study  of  small  angle  light  scattering  has  been  initiated  with  investigators  at  Boston  University. 

In  our  investigations  of  the  theory  and  practice  of  tissue  optics,  we  devised  an  optically-based  noninvasive 
method to quantify thermal damage in tissue. That method was used to study thermal lesions induced in bovine
myocardium  in  vitro.  Algorithms,  based  on  a  photon  random  walk  treatment  of  light  diffusion,  were  developed  to 
provide  optical  coefficients  from  the  measured  transmittances  and  reflectances.  In  collaboration  with 
investigators  from  the  National  Cancer  Institute,  we  are  presently  using  similar  methods  to  characterize  the 
optical  properties  of  human  breast  tissues.  In  a  related  project,  we  performed  a  theoretical  analysis  of  resolution 
limits  for  time-resolved  imaging  of  tumors  in  human  breast.  Photon  migration  theory  was  used  to  predict  the 
spatial  resolution  of  objects  embedded  at  different  depths  within  a  finite  slab,  and  dependencies  on  scattering 
cross section, sample thickness, and photon transit time were determined.
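
To give a feel for a photon random walk treatment of light diffusion, the Python sketch below follows photons on a cubic lattice inside a slab and counts how many are reflected, transmitted, or absorbed. It is a deliberately simplified forward model under stated assumptions (isotropic nearest-neighbour steps, a fixed absorption probability per step, injection one lattice unit below the surface); it is not the algorithm used in this project, and the function name and parameters are hypothetical.

```python
import random


def simulate_slab(thickness, absorb_prob, n_photons, seed=0):
    """Fractions of photons reflected, transmitted and absorbed by a slab.

    thickness   -- slab thickness in lattice units (surfaces at z = 0 and z = thickness)
    absorb_prob -- assumed probability that a photon is absorbed at each step
    """
    rng = random.Random(seed)
    counts = {"reflected": 0, "transmitted": 0, "absorbed": 0}
    for _ in range(n_photons):
        z = 1  # photon starts one lattice step inside the illuminated surface
        while True:
            if z <= 0:
                counts["reflected"] += 1
                break
            if z >= thickness:
                counts["transmitted"] += 1
                break
            if rng.random() < absorb_prob:
                counts["absorbed"] += 1
                break
            direction = rng.randrange(6)  # six nearest neighbours on the lattice
            if direction == 0:
                z += 1
            elif direction == 1:
                z -= 1
            # the remaining four directions are sideways steps: depth unchanged
    return {name: count / n_photons for name, count in counts.items()}
```

Running `simulate_slab(3, 0.01, 2000)` and `simulate_slab(30, 0.01, 2000)` shows the expected trend: the thicker slab transmits a much smaller fraction of the photons. Estimating optical coefficients from measured transmittance and reflectance amounts to inverting a forward model of this kind.
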
A theory of the kinetics of the absorption of calcium into bone based on a previously published theory of
chromatographic  kinetics  has  been  developed  by  G.  Weiss  and  collaborators.  The  model  has  been  tested  on  a 
number  of  different  normal  populations,  yielding  results  to  be  expected  from  general  physiological  principles. 
The  model  has  been  applied  to  patients  with  dermatomyositis  being  treated  with  steroids,  showing  that  the 
drug  regimen  impairs  bone  absorption  to  a  considerable  degree.  Preliminary  measurements  have  been  made 
on  different  disease  populations.  These  studies  will  be  continued  in  the  forthcoming  year. 

J.  Bryngelson  has  developed  a  theoretical  basis  for  protein  folding,  based  on  statistical  properties  of 
conformational  energies.  The  theory  has  been  used  to  improve  the  performance  of  currently  used  protein 
structure prediction programs. A continuation of this project relates to the effects of water exclusion in the initial
collapse  phase  in  protein  folding.  It  has  been  shown  that  hydrogen  bonds  are  increasingly  effective  in 
determining secondary structure, as the protein collapses.

A  theory  has  been  developed  by  G.  Weiss  to  estimate  the  time  for  a  gradient  gel  to  separate  peaks  in 
electrophoresis,  when  diffusion  effects  are  small  but  not  negligible.  This  is  combined  with  a  concurrent 
measurement by M. Garner and A. Chrambach of boundary spreading as a function of the gel concentration.

Two monographs by G. Weiss have appeared this year: Aspects and Applications of the Random Walk
(North-Holland, Amsterdam), and Contemporary Problems in Statistical Physics (SIAM, Philadelphia).
In this project, sophisticated image processing techniques are used to analyze biomedical images. The goal is
to  establish  collaborations  with  biomedical  experts  who  require  new  algorithms  and  possibly  new  hardware 
capability  to  solve  difficult  imaging  problems.  Typically,  complex  new  mathematical  algorithms  as  well  as  new 
combinations  of  existing  algorithms  are  utilized.  We  attempt  to  integrate  the  best  computer  platform  for  each 
problem  with  the  desired  goal  of  the  project,  using  such  diverse  computers  as  an  Apple  Macintosh,  a  DEC  VAX 
or  Alpha,  a  SUN  workstation,  or  an  Intel  iPSC/860  supercomputer. 

Two  current  projects  include  ophthalmic  image  analysis  and  general  consulting  to  the  NIH  scientific  community 
in biomedical image processing. In collaboration with the National Eye Institute, we continue the development
of  systems  to  quantitate  lens  opacities  (cataracts)  and  to  assist  in  diagnosis  of  ocular  diseases.  For  cataract 
studies, it was possible to use the computer-assisted instrumentation to observe the effects of anti-cataract
drugs  or  for  routine  pathological  grading.  During  the  last  year,  we  completed  a  system  that  analyses 
retro-illumination  images.  This  device  projects  light  onto  the  retina  and  then  captures  an  image  of  the  lens  with 
reflected  light.  The  technique  of  reflecting  light  off  the  retina  does  not  always  produce  a  perfect  image,  and 
sometimes  leaves  a  distortion  pattern  in  the  image  of  the  retina.  While  this  distortion  limits  the  device's 
effectiveness, it is the best system available to evaluate the anterior and posterior subcapsular cataracts. Before
making  quantitative  morphological  and  densitometric  measurements  on  these  images,  our  software  removes 
the  distortion  pattern. 

An  integral  part  of  our  image  processing  consulting  is  ongoing  support  for  the  NIH  Image  Program  (by  Wayne 
Rasband).  Our  support  includes  continuing  development  of  new  algorithms  and  four  supporting  documents, 
which  are  now  distributed  with  the  package.  These  documents  are  widely  used  and  referenced  both  in  the 
intramural  program  and  by  extramural  biomedical  scientists.  These  documents  include  a  guide  on  how  to  modify 
source  code,  which  is  intended  to  help  scientists  develop  new  user  applications  or  macros,  a  technical  guide 
describing  scientific  application  usage  with  the  package,  a  list  of  frequently  asked  questions  and  answers,  and 
finally  support  for  a  guide  (by  David  Chow)  concerning  analysis  of  gels. 
In this project, image processing techniques are used to analyze electron micrographs. To answer important
questions in structural biology, it is necessary to obtain relatively high resolution 2- and 3-D structural
information  about  biological  macromolecules. 

Biological  specimens  can  be  visualized  in  the  electron  microscope  using  a  number  of  specimen  preparation 
techniques.  Cryo-electron  microscopy,  a  relatively  new  technique,  attempts  to  preserve  "native"  structure  by 
surrounding  the  specimen  with  a  layer  of  ice.  Collaborative  studies  with  LSB,  NIAMS  are  currently  under  way  on 
a  number  of  projects,  whereby  electron  micrograph  images  are  computationally  corrected,  combined,  averaged, 
reconstructed,  or  in  some  way  computationally  enhanced  to  improve  the  signal-to-noise  ratio  or  to  increase  the 
interpretability  of  the  structures  being  visualized.  "Cryo"  images  are  typically  lower  contrast  and  require  greater 
computer  processing  than  conventional  electron  microscopy  to  achieve  satisfactory  results. 
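
The gain from averaging can be illustrated with a short Python sketch. It assumes that the images have already been aligned and that the noise in each image is independent with zero mean, in which case averaging N images reduces the noise standard deviation by about the square root of N. The one-dimensional "images", the noise level, and the function names are hypothetical choices for the example.

```python
import random
import statistics


def noisy_copies(signal, n_copies, sigma, seed=0):
    """Simulate n_copies aligned images of the same signal with Gaussian noise."""
    rng = random.Random(seed)
    return [[s + rng.gauss(0.0, sigma) for s in signal]
            for _ in range(n_copies)]


def average_images(images):
    """Pixel-by-pixel mean of a list of equally sized images."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]


def residual_noise(image, signal):
    """Standard deviation of the difference between an image and the true signal."""
    return statistics.pstdev([p - s for p, s in zip(image, signal)])


# Hypothetical low-contrast signal: a weak stripe pattern across 1000 pixels.
signal = [0.2 if (i // 50) % 2 else 0.0 for i in range(1000)]
copies = noisy_copies(signal, n_copies=64, sigma=1.0)

noise_single = residual_noise(copies[0], signal)
noise_averaged = residual_noise(average_images(copies), signal)
```

With 64 copies, `noise_averaged` is roughly one eighth of `noise_single`, which is why a stripe pattern that is invisible in any single noisy image becomes apparent in the average.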

Of  particular  interest  to  our  research  is  the  understanding  of  viral  structures.  At  present  we  are  continuing  our 
efforts  to  investigate  the  structure  of  a  large  animal  virus,  human  herpes  simplex  virus  (type  1).  We  are 
completing  the  localization  of  the  major  capsid  proteins  and  attempting  to  obtain  higher  resolution  structures. 
Biological material for these herpesvirus reconstructions is provided through a collaboration with researchers at
the  University  of  Virginia,  Charlottesville,  and  from  the  Upjohn  Co.,  Kalamazoo.  The  electron  microscopy  is 
performed  in  LSB,  NIAMS.  Interpretation  of  our  3-D  reconstructions  is  performed  jointly  by  all  collaborators. 

A  number  of  other  collaborative  projects  in  structural  biology  are  currently  in  progress.  We  are  using  3-D 
reconstruction  techniques  to  study  the  structure  of  icosahedral  L-A  virus  (from  yeast),  papillomavirus,  and  polio 
virus. We have compared the structures of full (RNA-containing) L-A virus with empty L-A virus. In a new study of
papillomavirus (in collaboration with NIAMS and NCI), we have verified the known structure of bovine
papillomavirus (bpv), and have recently obtained a 3D reconstruction of antibodies to the L1 protein of bpv. We
hope  to  be  able  to  localize  the  two  major  proteins  of  bpv,  as  well  as  to  understand  more  of  the  function  and 
activity  of  a  number  of  papilloma  antibodies. 
The superposition and registration of differing tomographic views is a difficult problem for investigators attempting to
correlate  brain  form  (structure),  derived  from  x-ray  computed  tomography  (CT)  images,  with  brain  function  (metabolism), 
revealed  by  nuclear  medicine  positron  emission  tomography  (PET)  images. 

For  this  reason,  an  attempt  is  being  made  to  develop  techniques  for  the  accurate  correlation  of  CT  structural  data  with  PET 
metabolic  information,  in  order  to  enhance  our  understanding  of  the  processes  underlying  the  generation  of  PET  images. 

Our  approach  has  three  stages:  firstly,  practical  methods  must  be  discovered  for  the  accurate  and  reproducible  placement  of 
the  head  within  a  tomographic  scanner's  aperture;  secondly,  techniques  for  monitoring  head  position  during  the  image 
acquisition  process  must  be  developed  to  correct  for  head  movement  before  the  image  is  generated;  thirdly,  simplified 
algorithms must be found for scaling and registering digitized images from different scanners on a digital display
subsystem. 

Precise orientation of the subject's skull within the scanner's aperture is monitored and recorded with a PC-based Polhemus
position/orientation  measurement  subsystem,  allowing  simultaneous  use  of  two  independent  sensors.  The  development  of 
two  inexpensive  custom-molded  oral  appliances  allows  the  Polhemus  subsystem's  sensor  to  be  fixed  to  the  subject's  skull. 
A  novel  targeting  algorithm  was  derived  to  provide  to  the  system  operator  visual  cues  related  to  head  position  within  a 
scanner's  imaging  volume.  Two-sensor  software  was  completed,  and  extensive  evaluation  has  begun  prior  to  its 
experimental  use  with  test  subjects. 

An  additional  position/orientation  measurement  subsystem  has  been  obtained  and  evaluated  for  linearity  and  for  sensitivity 
to  nearby  metallic  objects,  a  problem  common  to  all  electromagnetic-based  tracking  systems.  This  device's  utilization  of 
quasi-static  fields  was  designed  to  increase  its  immunity  to  close  proximity  of  certain  types  of  metal.  Although 
performance  of  this  new  position  measurement  system  was  good,  it  did  not  outperform  the  Polhemus  system  in  the 
presence  of  PET  scanners. 
Medical images are an important component of the medical record generated during a
patient's hospital stay or clinic visit. The NIH Clinical Center (CC), like most
university  and  research  hospitals,  is  attempting  to  solve  the  problem  of 
consolidating  medical  images  with  the  conventional  alphanumeric  medical  record  data 
in  the  Medical  Information  System  (MIS)  to  more  completely  realize  the  goal  of  a 
comprehensive  electronic  medical  record.  DCRT,  CC,  and  NCI  are  collaborating  to 
develop  a  series  of  demonstration  projects  that  explore  image  integration  into  the 
electronic  medical  record. 

Chest  X-rays  are  routinely  obtained  within  the  Diagnostic  Radiology  Department.   In 
this  application,  we  have  been  using  a  Vision  Ten  Rita!  system,  which  contains  a 
gray-scale  sheet  film  digitizer,  as  an  integral  part  of  an  image  gateway. 
Communication  of  medical  images  between  the  Radiology  Department's  Film  Library  and 
remote  sites  is  now  possible.   Future  plans  include  the  connection  of  two  General 
Electric CT scanners into the Vision Ten image transmission and display environment.

In addition, we are planning a prototype high-speed image communication network based
on  Asynchronous  Transfer  Mode  (ATM)  Switch  technology.   The  ATM  Switch  will  allow  155 
Mbit/sec  multi-media  communications  between  users.   This  prototype  network  would 
initially  support  high-performance  radiation  therapy  planning,  which  is  a 
collaborative effort between DCRT's Computational Bioscience and Engineering
Laboratory (CBEL) and the NCI Radiation Oncology Branch (ROB). CBEL's Intel iPSC/860
Supercomputer  will  be  utilized  to  apply  the  power  of  parallel  computing  methods  to 
the  computationally  intensive  calculations  required  for  radiation  therapy  planning. 
A custom-designed Radiology Consultation Workstation (RCWS) will be located in the
NCI Radiation Oncology Branch (ROB), as well as in the same building as the CBEL
Supercomputer.
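
To put the 155 Mbit/sec figure in perspective, the Python sketch below estimates the time needed to send one digitized image over such a link. It accounts only for the ATM cell format (53-byte cells carrying 48 bytes of payload) and ignores physical-layer framing, adaptation-layer overhead, and protocol latency. The image dimensions are a hypothetical example, not a specification of the film digitizer described above.

```python
def transfer_seconds(n_bytes, line_rate_bps=155e6, cell_bytes=53, payload_bytes=48):
    """Approximate time to send n_bytes of data as ATM cells at the given line rate."""
    cells = -(-n_bytes // payload_bytes)  # ceiling division: last cell may be partly empty
    return cells * cell_bytes * 8 / line_rate_bps


# Hypothetical digitized chest film: 2048 x 2048 pixels stored at 2 bytes per pixel.
image_bytes = 2048 * 2048 * 2
seconds_per_image = transfer_seconds(image_bytes)
```

Under these assumptions an 8 MB image takes a little under half a second to transmit, which suggests that interactive review of individual films over the planned network is plausible.
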
The goals of the high performance biomedical computing program are to identify and solve those computational problems
in biomedicine that can benefit from high performance hardware, modern software engineering principles, and efficient
algorithms.  This  effort  includes  providing  high  performance  parallel  computer  systems  for  the  NIH  staff  and  developing 
parallel  algorithms  for  biomedical  applications. 

Using  high  performance  parallel  computers,  biomedical  scientists  can  greatly  reduce  the  time  it  takes  to  complete 
computationally  intensive  tasks  and  take  new  approaches  in  processing  their  data.  This  may  allow  the  inclusion  of  more 
data  in  a  calculation,  the  determination  of  a  more  accurate  result,  a  reduction  in  the  time  needed  to  complete  a  long 
computation,  or  the  implementation  of  a  new  algorithm  or  more  realistic  model.  With  proper  computer  network 
connections  and  interactive  user  interface,  parallel  computing  is  readily  available  to  a  biomedical  researcher  in  the 
laboratory  or  clinic  at  the  investigator's  computer  workstation. 

In  addressing  these  computational  challenges,  CBEL  is  developing  algorithms  for  a  number  of  biomedical  applications  that 
can  benefit  from  computational  speedup,  including  image  processing  of  electron  micrographs,  radiation  treatment  planning, 
medical  imaging,  protein  and  nucleic  acid  sequence  analysis,  human  genetic  linkage  analysis,  protein  folding  prediction, 
nuclear  magnetic  resonance  spectroscopy,  x-ray  crystallography,  quantum  chemical  methods,  and  molecular  dynamics 
simulations.  The  ultimate  goal  is  to  have  high  performance  parallel  computing  facilitate  the  science  that  is  done  at  NIH. 
While  developing  these  computationally  demanding  applications,  CBEL  is  investigating  the  following  high  performance 
computing  issues:  partitioning  a  problem  into  many  parts  that  can  be  independently  executed  on  different  processors; 
designing  algorithms  so  that  delays  of  interprocessor  communication  can  be  kept  to  a  small  fraction  of  the  computation 
time;  designing  the  parts  so  that  the  computing  load  can  be  distributed  evenly  over  the  available  processors  or  dynamically 
balanced;  designing  algorithms  so  that  the  number  of  processors  is  a  parameter  and  the  algorithms  can  be  configured 
dynamically  for  the  available  machine;  developing  tools  and  environments  for  producing  portable  parallel  programs  and 
monitoring  system  performance;  and  proving  that  a  parallel  algorithm  on  a  given  machine  meets  its  specifications. 
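
Two of the issues listed above, partitioning a problem into independent parts and treating the number of processors as a parameter, can be illustrated with a simple block decomposition. The Python sketch below is a serial illustration only, with hypothetical function names; it shows how work can be divided so that the load differs by at most one item between processors, and it does not represent the iPSC/860 software.

```python
def block_partition(n_items, n_procs):
    """Split n_items into n_procs contiguous blocks whose sizes differ by at most one.

    Returns a list of (start, stop) index pairs, one per processor rank.
    """
    base, extra = divmod(n_items, n_procs)
    bounds = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)  # first `extra` ranks take one more item
        bounds.append((start, start + size))
        start += size
    return bounds


def partial_sums(values, n_procs):
    """Work each 'processor' would do independently: sum its own block."""
    return [sum(values[lo:hi]) for lo, hi in block_partition(len(values), n_procs)]


# The same code runs unchanged for any processor count.
data = list(range(100))
total_on_7 = sum(partial_sums(data, 7))
total_on_16 = sum(partial_sums(data, 16))
```

For example, `block_partition(10, 3)` gives `[(0, 4), (4, 7), (7, 10)]`. On a real parallel machine each block would be handled by a separate processor and only the partial results would be communicated, which keeps interprocessor traffic small relative to the computation.
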
We are developing and evaluating statistical methods appropriate to prediction of protein structure
from  sequence.  These  methods  include  Fisher  discriminant  analysis,  logistic  discriminant 
analysis,  artificial  neural  networks,  density  estimation  techniques,  cross-validation  and  bootstrap 
techniques,  and  computer  graphical  approaches.  A  new  finding  in  this  field  is  the  apparent  utility 
of  homologous  sequences  in  predicting  the  structure  of  an  index  sequence.  An  overall 
improvement  of  4-5%  is  obtained  using  this  approach,  compared  with  others.  We  have  sought  to 
further  increase  the  efficiency  of  these  algorithms  by  optimizing  the  alignment  of  the  homologous 
sequences,  and  by  making  use  of  ancillary  information,  such  as  the  presence  of  gaps  in  the 
alignment. 
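
As a small illustration of how cross-validation is used to evaluate a classifier, the Python sketch below estimates the accuracy of a nearest-centroid rule by k-fold cross-validation. The classifier, the single numerical feature, and the labels are stand-ins chosen for brevity; they are assumptions of this example and not the discriminant methods or sequence features used in the project.

```python
def fit_centroids(xs, ys):
    """Mean feature value for each class label."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(values) / len(values) for label, values in groups.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def k_fold_accuracy(xs, ys, k):
    """Fraction of samples classified correctly when each is held out once.

    Sample i belongs to fold i % k; the model is refitted on the other folds.
    """
    correct = 0
    for fold in range(k):
        train_x = [x for i, x in enumerate(xs) if i % k != fold]
        train_y = [y for i, y in enumerate(ys) if i % k != fold]
        model = fit_centroids(train_x, train_y)
        correct += sum(predict(model, x) == y
                       for i, (x, y) in enumerate(zip(xs, ys)) if i % k == fold)
    return correct / len(xs)


# Hypothetical one-feature data set with two well separated classes.
features = [0.0, 5.0, 0.2, 5.1, 0.1, 4.9, 0.3, 5.2]
labels = ["coil", "helix", "coil", "helix", "coil", "helix", "coil", "helix"]
accuracy = k_fold_accuracy(features, labels, k=4)
```

Because every sample is scored by a model that never saw it during fitting, the cross-validated accuracy is a less optimistic estimate than the accuracy measured on the training data itself.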

In  a  study  that  attempts  to  refute  the  notion  of  saltatory  or  pulsatile  growth  in  humans,  an  analysis 
of daily length measurements in humans was made. A new analysis method was proposed that is
more  efficient  than  previous  approaches,  yet  is  easily  interpreted  graphically  and  provides  a 
precise  definition  of  a  saltatory  growth  process.  Numerical  simulations  confirmed  the 
performance  characteristics  of  the  method. 

The  statistical  analysis  of  the  relationships  between  placental  corticotropin  releasing  hormone 
(CRH)  and  other  hormones  of  the  hypothalamic-pituitary-adrenal  axis  in  third  trimester  pregnancy 
showed that, while adrenocorticotropin (ACTH) and cortisol are correlated over the 12 hour
sampling period, CRH does not correlate significantly with ACTH or cortisol nor does it show
circadian  variation.  Thus,  there  is  no  evidence  of  a  regulatory  role  of  glucocorticoids  on  placental 
CRH. 

Statistical  and  mathematical  modeling  consultation  and  advice  were  given  to  several  NIH 
investigators in areas of ligand binding and kinetic data analysis. Refinements to the computer
programs LIGAND and ALLFIT were made, especially in the area of the user interface and
graphics. Several hundred copies of these programs were distributed to users at NIH and
elsewhere. 
Work is ongoing to develop and provide an integrated framework for computational support of research
in  comparative  DNA/protein  sequence  analysis  and  related  areas  across  multiple  genomes/species.  The 
logic  programming  language  PROLOG  is  used  throughout  this  project,  permitting  data  of  disparate 
types  to  be  combined  rapidly  and  effectively,  and  permitting  complex  queries  from  the  integrated  data. 
Toolset  development  was  performed  in  close  collaboration  with  Drs.  R.  Overbeek  and  R.  Hagstrom  of 
Argonne  National  Laboratory. 

Current  work  focuses  on  the  addition  of  large  volumes  of  data  from  multiple  sources,  resulting  in  a 
unique  resource,  combining  data  from  a  number  of  current  sources  such  as  GenBank,  EMBL,  Prosite, 
SwissProt  and  others  including  metabolic  data  from  Dr.  Overbeek's  Russian  collaborators.  This  will 
form  an  integrated  database  with  DNA  and  protein  sequence,  motif,  metabolic  pathway  and  other  data 
for  multiple  genomes.  The  work  incorporates  analysis  of  genomic  organization  and  genetic  regulation  of 
metabolic  pathways.  This  database  and  associated  tools  will  permit  answers  to  queries  that  are  difficult 
or  impossible  to  satisfy  using  the  standard  biological  databases  currently  available. 

This  database  is  also  the  underlying  data  repository  for  a  World-Wide  Web  (WWW)  hypertext  browser 
implemented  by  R.  Taylor  and  A.  Ginsburg,  DCRT/BIMAS.  It  is  expected  that  this  WWW  service  will 
provide  a  unique  resource  to  the  biomedical  research  community  over  the  Internet,  employing  simple 
and  widely  available  end-user  client  tools,  such  as  NCSA  Mosaic  for  access.  This  will  supplement 
present WWW servers at NCBI and at EMBL in Europe, providing unique services unavailable to date.
The on-going seminar series "Topics in Analytical Cytology" hosted two sessions during the year under
the auspices of the NIH Computer Training Program, with presentations by NIH, FDA and USUHS
researchers  in  "Flow  and  Image  Cytometry  in  B-cell  Chronic  Lymphocytic  Leukemia,"  "Advanced 
Techniques  in  Quantitative  Fluorescence  Microscopy,"  "Studies  of  Drug  and  Carcinogen  Efflux  in 
Multi-drug  Resistant  Cells  using  Adherent  Cell  Laser  Cytometry"  and  "In  vivo  Confocal  Microscopy  of 
the  Human  Eye". 

The Cluster Analysis Program (CAP) has been ported from its originally designed VAX/VMS
minicomputer  and  graphics  terminal  environment  to  a  RISC  OpenVMS  Motif  workstation  platform, 
with  some  necessary  changes  to  computational  algorithms  and  data  structures  to  take  more  complete 
advantage  of  the  RISC  architecture. 

The Laboratory Analysis Package (LAP) was originally developed to run on SUN3 UNIX workstations
as a general-purpose tool for both interactive and batch processing of laboratory data. LAP is currently
implemented in C++ version 2.1 and has been ported to SUN4, VMS (VAX and Alpha), and Convex
architectures.  It  is  used  extensively  by  two  laboratories  in  NIDDK  and  numerous  Flow  Cytometry  sites 
at  NIH. 
The Molecular Graphics and Simulation Section studies problems of biological significance using
several  theoretical  techniques:  molecular  dynamics,  molecular  mechanics,  modeling,  ab  initio 
analysis  of  small  molecule  structure,  and  molecular  graphics.  These  techniques  are  applied  to  a 
wide  variety  of  macromolecular  systems. 

Specific  projects  related  to  the  study  of  AIDS  proteins  include:  simulations  of  HIV-1  reverse 
transcriptase, analysis of inhibitor binding to the active site of HIV-1 protease, and investigation
of the mechanism of action of HIV-1 protease.

Other  research  applied  to  molecules  of  biomedical  interest  uses  molecular  dynamics  simulations  to 
predict  function  or  structures  of  peptides  and  proteins.  Such  projects  include: 

-  Modeling  intermediate  filament  (IF)  proteins 

-  Identification of peptides that bind to human MHC DR1

-  Modeling  the  V3  loop  in  HIV-1  correlating  with  syncytium  formation 

-  Simulation  of  a  large  virus  complex 

Basic  research  is  underway  to  provide  a  better  understanding  of  macromolecular  systems.  The 
projects  include  studies  of: 

-  Temperature  effects  on  protein  dynamics 

-  Effects  of  hydration  on  protein  dynamics 

-  Protein  anharmonicity  and  the  role  of  dihedral  transitions 

-  Molecular  dynamics  simulations  on  Staphylococcal  nuclease:  comparison  with  NMR  data 

-  Harmonic  analysis  of  large  systems 

-  Modeling  and  simulation  of  lipid  bilayers  in  crystal  and  gel  phases 

-  Molecular  dynamics  simulation  studies  of  DNA:  the  B-Z  junction 

-  The  mechanism  of  lysozyme  elucidated  by  quantum  mechanical/molecular  mechanical 
(QM/MM)  techniques 

-  The  mechanism  of  ribonuclease  A  elucidated  by  QM/MM  techniques


PERIOD  COVERED

October  1, 1993  to  September  30, 1994 


TITLE  OF  PROJECT     (80  characters  or  less.  Title  must  fit  on  one  line  between  the  borders.)

Development  of  Theoretical  Methods  for  Studying  Biological  Macromolecules 


PRINCIPAL  INVESTIGATOR       (List  other  professional  personnel  below  the  Principal  Investigator.)   (Name,  title,  laboratory,  and  institute  affiliation)

PI:       B.R.  Brooks,  Ph.D.  Section  Chief  MGS,  LSB,  DCRT

others:     P.J.  Steinbach,  Ph.D.  Senior  Staff  Fellow  MGS,  LSB,  DCRT

D.C.  Chatfield,  Ph.D.  NRC  Postdoctoral  Fellow  MGS,  LSB,  DCRT

M.  Hodoscek,  Ph.D.  Visiting  Fellow  MGS,  LSB,  DCRT 

S.  Mathur,  Ph.D.  Visiting  Scientist  MGS,  LSB,  DCRT 

J.  Zhou,  Ph.D.  Guest  Researcher  MGS,  LSB,  DCRT 

K.  Eurenius,  Ph.D.  NRC  Postdoctoral  Fellow  MGS,  LSB,  DCRT


DEPARTMENT  OF  HEALTH  AND  HUMAN  SERVICES  -  PUBLIC  HEALTH  SERVICE
NOTICE    OF   INTRAMURAL    RESEARCH    PROJECT 


PROJECT  NUMBER 

Z01CT00233-04LSB 


COOPERATING  UNITS      (if  any) 

Howard  University  (W.M.  Southerland);  FDA  Center  for  Biologics  Evaluation  and  Research  (R.M.
Venable,  R.W.  Pastor);  Courant  Institute,  New  York  University,  New  York  (T.  Schlick);  Harvard 
University  (Martin  Karplus  group);  Carnegie  Mellon  Univ.,  Pittsburgh,  PA  (C.L.  Brooks  III). 


Laboratory  of  Structural  Biology 


Molecular  Graphics  and  Simulation 


INSTITUTE  AND  LOCATION   National  Institutes  of  Health,  Bethesda,  Maryland  20892


TOTAL  MAN-YEARS: 


2.2 


PROFESSIONAL: 


2.2 


CHECK  APPROPRIATE  BOX(ES) 

□  (a)  Human  subjects  □  (b)  Human  tissues  ☒  (c)  Neither

□  (a1)  Minors
□  (a2)  Interviews


SUMMARY  OF  WORK    (Use  standard  unreduced  type.   Do  not  exceed  the  space  provided.)

New  theoretical  techniques  are  often  coupled  with  software  and  hardware  development,  such  as  the 
generation  of  new  simulation  techniques  and  the  systematic  testing  and  evaluation  of  methods.  Specific 
projects  include: 

-  Development  of  Langevin  Piston  methods  for  NPT  simulation  of  periodic  systems  and  for  stochastic
boundary  molecular  dynamics  (MD)  simulations

-  Development  of  quantum  mechanical  potentials  and  appropriate  algorithms  for  use  in  molecular  dynamics
simulations

-  Determination  of  protein  structure  by  NMR  and  molecular  modelling

-  Development  of  an  optimized  protocol  for  the  preparation  of  low  temperature  states

-  Development  of  flexible  MD  techniques  that  remove  high-frequency  degrees  of  freedom

-  Development  of  the  REPLICA/PATH  method  for  determining  reaction  paths  in  complex  systems  using
simulated  annealing

-  Free  energy  perturbation  simulations  in  solution,  examining  the  effect  of  restraints

-  Conversion  of  physical  models  into  three-dimensional  coordinates  for  computer  analysis  and  simulation

-  Development  of  ray-traced  molecular  graphics  software  for  HP  workstations,  high-resolution  color  printers
and  for  movies  using  NTSC  video  equipment

-  Adaptation  of  a  Truncated  Newton  minimizer  for  CHARMM  and  biomolecular  applications.

Parameter  sets  and  models  are  generally  available  for  most  macromolecular  systems,  but  there  is  considerable 
room  for  improvement,  and  alternate  models  that  improve  realism,  or  reduce  computational  costs,  need  to  be 
examined.  This  effort  involves  the  refinement  of  parameters  and  the  exploration  of  alternate  energetic  models 
for  molecules  and  environmental  conditions.  Ongoing  projects  include: 

-  Evaluation  of  parameter  sets 

-  Approximation  of  long-range  interactions  in  macromolecular  simulation:  variants  of  the  Ewald  Sum  method,
using  a  particle  mesh  grid

-  New  methods  for  long-range  truncation  of  the  energy  potential

-  Evaluation  and  comparison  of  implicit  and  explicit  water  models  for  simulations  examining  the  hydration  of
proteins

-  Molecular  dynamics  simulation  studies  of  DNA:  analysis  of  the  parameter  sets  using  an  infinite  DNA  helix

With  the  advent  of  new  computer  technology  amenable  to  large-scale  scientific  computing,
software  and  hardware  development  efforts  are  essential  for  optimal  use  of  these  resources.  The
efforts  include  the  development  of  techniques  to  exploit  parallel  multi-machines,  writing  assembler
code  for  commercial  processors,  and  establishing  a  parallel  workstation  cluster  for  high-efficiency 
simulations  at  low  cost. 

Development  of  methods  and  software  to  make  productive  use  of  parallel  MIMD  machines  for  use 
in  macromolecular  simulations  is  under  way.  The  initial  global  communication  approach  has  been 
successful  in  providing  an  efficient  full-feature  version  of  CHARMM  (Chemistry  at  HARvard 
Macromolecular  Mechanics).  This  parallel  version  of  CHARMM  has  been  extended  to  run  on 
almost  any  MIMD  parallel  computer  platform:  Intel  iPSC/860,  Intel  Delta,  CM-5,  IBM  SP1,
Convex  SP1,  and  on  clusters  of  workstations.  Our  current  development  effort  involves  a  scalable
algorithm  that  promises  to  greatly  reduce  the  communication  cost  for  very  large  MPP  machines  or 
for  large  workstation  clusters. 
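
As a rough illustration of what a global communication scheme can look like, the Python sketch below splits a list of atom pairs among several workers, lets each compute partial forces on a full copy of the coordinates, and then combines the partial results with a global sum. This replicated-data layout is an assumption made for illustration; the report does not describe the actual CHARMM implementation, and the pair force, the function names, and the serial stand-in for message passing are all hypothetical.

```python
def pair_force(xi, xj):
    """Toy 1-D repulsive force on particle i due to particle j (hypothetical potential)."""
    d = xi - xj
    return d / abs(d) ** 3  # magnitude 1/d^2, directed away from j


def partial_forces(positions, pairs):
    """Forces from one worker's share of the pair list, on a full-length array."""
    forces = [0.0] * len(positions)
    for i, j in pairs:
        f = pair_force(positions[i], positions[j])
        forces[i] += f
        forces[j] -= f  # Newton's third law
    return forces


def global_sum(partials):
    """Stand-in for the global (all-to-all) reduction step: element-wise sum."""
    return [sum(column) for column in zip(*partials)]


positions = [0.0, 1.0, 2.5, 4.0]
n = len(positions)
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

n_workers = 3
chunks = [pairs[w::n_workers] for w in range(n_workers)]  # round-robin split
total = global_sum([partial_forces(positions, chunk) for chunk in chunks])
```

In a layout like this every worker exchanges a full-length force array at each step, so the communication volume grows with system size regardless of the number of workers, which is the kind of cost a more scalable algorithm would aim to reduce.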

Current  projects  include: 

-  A  scalable  molecular  dynamics  algorithm  for  MPP  machines  and  large  workstation  clusters. 

-  Development  of  parallel  quantum  mechanical/molecular  mechanical  (QM/MM)  methods 

-  Development  and  efficient  use  of  a  high-speed  workstation  cluster  of  HP735s 

-  Development  and  support  of  CHARMM 

Computational  folding  of  proteins  moved  in  this  year  from  an  abstract  geometry-free  model  developed
during  the  previous  year  to  a  3-dimensionally  embedded  constraint  evaluation  model.  The 
geometry-free  model  was  implemented  as  a  series  of  topological  connections  between  charged  atoms 
representing  the  hydrophilic  aspects  of  peptide  biochemistry  and  another  series  of  connections  between
groups  of  carbon  atoms  representing  the  hydrophobic  aspects.  The  rule-based  manipulation  of 
topological  connections  is,  computationally,  relatively  inexpensive. 

In  order  to  maintain  a  model  that  is  physically  reasonable,  the  sequential  synthesis  of  the  protein  from  N 
to  C  terminus  is  modeled.  In  ribosomal  synthesis,  it  is  only  when  the  peptide  emerges  from  the 
ribosome  that  it  begins  to  adopt  a  folded  conformation.  The  linearly  increasing  peptide  length  keeps  the
EXjEOM  computation  time  to  a  minimum.  We  have  found  that  the  pattern  of  hydrophilic  and 
hydrophobic  constraints  develops  with  the  lengthening  of  the  peptide  in  such  a  way  that  the  peptide
adopts  conformations  that  are  very  close  to  the  crystal  or  NMR  structure  observed  after  ribosomal
emission.  As  the  sequence  for  a  helical  portion  of  a  peptide  is  emitted,  it  folds  into  a  helix.  As  soon  as
the  sequence  for  an  antiparallel  beta  sheet  is  emitted,  it  too  folds  into  the  correct  secondary  structure.
The  most  striking  result  from  this  year's  simulations  is  the  discovery  that  the  strand-helix-strand  peptide
sequence  forms  a  tertiary  structure  with  the  correct  macroscopic  handedness  when  it  is  emitted. 

Ten  classes  of  protein  architecture  are  being  studied  in  parallel,  in  an  attempt  to  make  the  computational 
simulation  of  protein  folding  as  general  as  possible.  We  have  observed  that,  in  all  structural  classes,  the 
correct  local  secondary  structure  is  formed,  and  often  the  correct  macroscopic  handedness  is  also 
formed.  So  far,  the  simulation  program  has  not  been  able  to  reproduce  the  tightly  packed  atomic
structure  characteristic  of  crystal  or  NMR  structures.  Rules  and  parameters  are  continually  added  and
modified,  in  attempts  to  produce  better  packing. 

SUMMARY  OF  WORK

The  Nuclear  Medicine  Department  (NMD)  of  the  Clinical  Center  has  developed  a  small 
field-of-view  (FOV)  gamma  camera  which  has  great  promise  for  practical,
high-resolution  imaging  of  small  animals.   The  system  is  based  on  a  single 
position-sensitive  photomultiplier  tube  (PMT).   Unfortunately,  the  position-sensitive 
PMT  does  not  possess  either  a  linear  voltage  analog  of  event  position,  or  a  uniform 
energy  response  across  the  tube  face. 

We  have  developed  a  Multibus  II  Image  Correction  System,  comprising  three  coupled 
386/486  processors,  which  allows  first-order,  geometric  and  energy  corrections  to  be 
performed  sequentially,  in  real-time  on  data  from  the  small  FOV  gamma  camera.   The 
Image  Correction  System  acts  either  as  a  stand-alone,  two-processor  data  acquisition 
system  for  the  small  FOV  gamma  camera,  or  it  is  interposed  between  this  camera  and  a 
commercial  analog  acquisition  system,  and  used  as  a  three-processor  system, 
dynamically  correcting  the  data  transmitted  to  the  Analog  Acquisition  System. 
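
To make the idea of sequential per-event corrections concrete, the following Python sketch applies a table-driven position shift and an energy gain to a single detected event. It is only an illustration under stated assumptions: the report does not give the form of the corrections, so the lookup-table layout, the cell size, and the names `correct_event`, `dx_table`, `dy_table`, and `gain_table` are hypothetical.

```python
def correct_event(x, y, energy, dx_table, dy_table, gain_table, cell):
    """Apply a geometric shift and an energy gain taken from calibration tables.

    The tables are indexed by the coarse grid cell that contains the raw
    (uncorrected) event position; `cell` is the cell edge length in raw units.
    """
    row, col = y // cell, x // cell
    x_corr = x + dx_table[row][col]            # geometric correction in x
    y_corr = y + dy_table[row][col]            # geometric correction in y
    e_corr = energy * gain_table[row][col]     # energy (gain) correction
    return x_corr, y_corr, e_corr


# Hypothetical 2 x 2 calibration tables for a 128 x 128 raw coordinate range.
dx_table = [[0, -2], [1, 0]]
dy_table = [[0, 1], [0, 0]]
gain_table = [[1.0, 1.5], [1.0, 1.0]]

event = correct_event(70, 10, 100.0, dx_table, dy_table, gain_table, cell=64)
```

Because each event needs only a table lookup and a few arithmetic operations, a correction of this kind is cheap enough to run on every event as it arrives.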

The  three  processors  are  dedicated  to  input  (analog-to-digital  conversion),
computation  (geometric,  energy  and  motion  correction),  and  output  (digital-to-analog 
conversion  or  digital  transmission),  respectively.   Software  for  system  control,  data 
acquisition,  corrected  and  uncorrected  image  display,  and  data/image  transmission  has
been  developed.   All  geometric  and  energy  correction  software  has  been  completed.   It 
is  possible  to  acquire  up  to  ten  simultaneous  inputs  via  the  high-speed 
analog-to-digital  converter  module. 

Work  is  also  beginning  on  the  implementation  of  new  algorithms  for  image  acquisition
in  PET  scanner  mode  that  are  suitable  for  use  in  imaging  the  human  breast.   This
joint  effort,  involving  DCRT,  CC,  NCI,  and  BEIP  personnel,  has  the  goal  of  early
detection  of  breast  cancer.

The  theme  of  this  work  is  to  develop  a  useful,  accurate  science  of  the  forces  that  organize  biomolecules.
To  this  end  we  have  accelerated  our  efforts  to  measure  forces  between  proteins,  DNA  double  helices, 
and  polysaccharides.  We  have  also  concluded  a  set  of  studies  on  the  release  of  water  upon  DNA/protein 
and  DNA/drug  binding. 

Force  measurements  between  collagen  triple  helices  have  shown  how  decreasing  temperature,  lowering 
pH,  or  adding  glycerol  can  remove  the  attractive  forces  that  reconstitute  collagen  from  solution.  At  least 
in  this  case,  the  independent  action  of  these  different  changes  in  condition  provides  strong  evidence 
against  the  popular  assumption  that  "hydrophobic  interactions"  stabilize  protein  assembly. 

This  year,  we  published  the  first  of  our  intended  "toolbox"  papers,  which  codify  measured  DNA-DNA 
forces  in  a  form  that  can  be  used  in  computation  and  analysis  of  molecular  assembly.  These  forces  are 
themselves  the  center  of  our  own  investigation  into  the  packing  of  DNA  and  its  packaging  into  ordered 
assemblies,  such  as  in  viruses. 

We  have  begun  an  extensive  series  of  measurements  on  forces  among  stiff  polysaccharides,  the  most 
neglected  of  all  bio-materials.  There  is  a  strong  technological  as  well  as  biological  motivation  for
understanding  these  interactions.

This  year  has  seen  the  first  quantitative  measurement  of  the  amount  of  water  released  upon  specific  vs. 
non-specific  binding  of  DNA  to  protein  (lac  repressor)  or  upon  the  binding  of  DNA  to  various  drugs. 
There  is  an  immediate  energetic  connection  between  these  changes  in  molecular  hydration  and  the 
powerful  "hydration  forces"  measured  between  large  molecules  when  they  are  brought  into  contact. 

The  growing  catalog  of  information  about  these  interactions  continues  to  create  a  new  logic  of  thinking 
about  molecular  recognition  and  folding. 

Our  studies  have  progressed  by  means  of  two  strategies:

1)  The  structures  of  ionic  channels  can  be  interrogated  by  measuring  their  reaction  to  polymers  of 
varied  size 

2)  Sophisticated  physical  "noise"  analysis  allows  one  to  follow  the  very  rapid  kinetics  of  ionic  channels 
in  several  different  processes,  such  as  the  passage  of  neutral  polymers  through  the  channels  or  the
formation  of  channels  by  drugs  added  to  one  side  of  a  membrane 

Channels  made  from  the  peptide  alamethicin  have  been  observed  while  subjected  to  the  osmotic  action  of 
differently  sized  neutral  polymers.  It  is  possible  not  only  to  see  the  degree  of  penetration  of  the
polymers  into  the  channel  from  their  osmotic  action  but  also  to  follow  the  kinetics  of  motion  of  small 
polymers  through  the  ionic  channel. 

These  channels  are  sensitive  to  the  identity  of  the  phospholipids  in  the  bilayer  into  which  they  are
incorporated;  in  particular,  there  is  a  strong  correlation  between  the  probability  of  high-conductance
states  and  the  tendency  of  the  phospholipid  to  form  non-lamellar  structures.

The  Hofmeister  effect  is  shown  to  apply  to  transport  properties  of  ionic  channels.  Chaotropic  anions 
bind  to  roflamycoin  channels  for  longer  times,  increase  their  conductance  and  induce  cationic  selectivity 
according  to  their  position  in  the  Hofmeister  series.

Studies  of  the  one-sided  action  of  the  drug  amphotericin  B  (with  the  drug  added  only  from  one  side  of 
the  bilayer)  were  conducted  on  cholesterol-  and  ergosterol-containing  bilayers.  As  administered,  drugs
act  predominantly  from  one  side;  furthermore,  the  differential  toxicity  of  drugs  appears  to  depend  on  the 
different  sterol  content  of  the  tissue  and  the  infectious  agent;  thus,  this  would  appear  to  be  the 
appropriate  protocol  for  determining  toxicity.  Differences  of  drug  action  were  in  accord  with  expected 
discrimination  under  administration. 

A  program  in  the  Mathematica  language  was  prepared  for  ALLFIT  analysis,  making  available  a  more
generalized  system  of  models.  Due  to  problems  with  curve-fitting  and  retirement  of  the  author  of  the 
program,  this  program  was  not  made  a  production  system. 

A  package  of  Mathematica  functions  for  manipulation  of  polynomials  with  multiple  variables  was 
completed,  and  a  talk  was  given  on  it  at  the  Mathematica  Developers  Conference  held  in  April  1994.  A 
manuscript  describing  this  package  was  submitted  for  publication  in  the  Mathematica  Journal. 

Research  in  neural  networks  and  preliminary  investigations  of  the  Boltzmann  machine  and  the  Gibbs 
sampler  were  discontinued  due  to  the  retirement  of  Dr.  Hutchinson. 

SUMMARY  OF  WORK    (Use  standard  unreduced  type.    Do  not  exceed  the  space  provided.)

The  Distributed  Systems  Section  (DSS)  of  the  Computing  Facilities  Branch  is  pursuing  a  long-term  investigation 
of  scientific  and  administrative  applications  of  object  technology,  such  as  object-oriented  analysis,  design,  and 
programming,  object-oriented  user  interfaces,  object-oriented  database  management  systems  (OODBMS),  and 
object-based  distributed  computing  systems.  This  project  is  a  continuation  and  extension  of  our  previous  work,
begun  as  part  of  the  Advanced  Laboratory  Workstation  (ALW)  Project,  on  object-oriented  programming  in  C++,
the  OI  user  interface  toolkit  and  builder,  the  ObjectStore  OODBMS,  and  of  our  interest  in  emerging  distributed
system  standards,  such  as  the  OSF  Distributed  Management  Environment  (DME)  and  the  Object  Management 
Group's  (OMG)  Object  Management  Architecture  (OMA)  and  Common  Object  Request  Broker  Architecture 
(CORBA). 

In  FY94,  we  completed  and  deployed  for  beta  test  the  first  version  of  xemt,  our  first  major  C++  software
application  to  use  the  OI  user  interface  toolkit  and  builder.  Xemt  provides  a  graphical  user  interface  to  the
Environment  Maintenance  Tool  (EMT),  which  manages  applications  software  for  the  Advanced  Laboratory 
Workstation  (ALW)  system.  This  enables  application  maintainers  and  developers  to  more  easily  manage  their 
own  software  collections,  and  to  integrate  them  into  the  ALW  environment.  Unfortunately,  upgrading  to  AFS 
3.3  caused  EMT  to  no  longer  work,  so  xemt  cannot  be  used  until  the  maintainers  of  EMT  correct  the 
incompatibility. 

We  have  procured  JAM,  an  object-oriented  4GL,  and  have  begun  using  it  to  develop  a  business  system  to 
support  ALW  hardware  and  software  maintenance. 

We  have  also  purchased  several  leading  C++  class  libraries  to  support  C++  development  as  alternatives  to  the
NIH  Class  Library,  which  we  no  longer  have  the  resources  to  maintain. 

Finally,  we  continued  our  membership  in  the  Object  Management  Group  (OMG),  an  industry  organization 
dedicated  to  producing  a  framework  and  specifications  for  commercially  available  object-oriented  environments. 

SUMMARY  OF  WORK    (Use  standard  unreduced  type.   Do  not  exceed  the  space  provided.)

The  Computing  Facilities  Branch,  the  Communications  Technology  Section  of  the  Personal  Computing 
Branch,  and  the  Scientific  Computing  Resource  Center  will  collaborate  on  developing  a  successor  to  the 
Advanced  Laboratory  Workstation  (ALW)  system  based  on  the  Open  Software  Foundation's  Distributed 
Computing  Environment  (DCE),  and  will  also  devise  and  carry  out  a  plan  for  migrating  the  ALW  system  to  its  DCE 
successor.  Migration  to  DCE  is  necessary  because  DCE,  as  an  emerging  de  facto  industry  standard,  will 
eventually  supersede  the  AFS  distributed  file  system  upon  which  the  current  ALW  system  is  based.  Also,  DCE 
will  allow  us  to  extend  ALW  distributed  systems  technology  to  the  PC,  Macintosh,  and  the  Convex  and  IBM 
mainframes,  thereby  advancing  DCRT's  strategic  plan  to  provide  interoperability  among  these  systems. 

In  FY94,  we  set  up  the  hardware  and  software  needed  for  a  small  DCE  test  cell,  running  DCE  core  services 
only  (no  distributed  file  system). 

We  played  a  prominent  role  in  architectural  management  activities,  contributing  to  the  Architectural 
Management  Staff  (AMS)  retreat,  facilitated  by  the  Gartner  Group  and  the  AMS  NOS  and  E-mail  subcommittees. 

We  successfully  conducted  a  beta  test  of  netatalk,  a  free  software  package  developed  at  the  University  of 
Michigan,  which  enables  Apple  Macintosh  computers  to  access  AFS  files.  However,  security  and  performance 
need  to  be  improved  before  we  release  it  for  production  use. 

We  have  begun  a  partnership  with  UniPress  Software,  Inc.,  to  add  support  for  AFS  to  their  LAN-Manager  for 
UNIX  (LMU)  product.  If  successful,  this  will  enable  PCs  running  DOS  and  Windows  to  access  AFS  files.  We  have 
verified  that  LMU  can  already  read  and  write  AFS  files,  but  that  it  does  not  perform  authentication.  We  have
developed  an  interface  specification  between  the  LMU  server  and  an  AFS  authentication  library,  which  we  will 
implement. 

We  received  two  DCE-based  software  products:  Encina,  a  distributed  transaction  monitoring  system  that  can 
provide  connectivity  between  DOS  Windows  and  UNIX  clients  and  DB2  running  under  MVS,  and  DAZEL,  a 
distributed  document  delivery  system.  We  have  not  had  sufficient  staff  time  to  install  Encina,  which  is  an 
extremely  complex  system.  We  have  installed  DAZEL,  but  have  not  yet  gotten  it  to  work  satisfactorily;  it  is  still  an
immature,  overpriced  product. 

We  assisted  the  newly  formed  Customer  Services  Branch  (CSB)  this  year  in  setting  up  UNIX  servers  for  an 
electronic  Help  Desk  support  system. 

SUMMARY  OF  WORK

Clinical  color  Doppler  ultrasound  technology  is  a  popular,  non-invasive,  real-time, 
relatively  inexpensive  imaging  modality,  which  currently  allows  the  2D  visualization 
of  blood  flow  within  the  heart  and  the  vascular  system.   Doppler  ultrasound  flow 
velocity  measurement  is  important  for  the  determination  of  blood/oxygen  supply  to 
various  organs,  of  arterial  wall  shear  stress  and  blood-tissue  gas  exchange,  as  well 
as  for  the  evaluation  of  myocardial  and  valvular  function. 

Initially,  we  have  chosen  to  concentrate  on  the  structure  and  flow  in  the  carotid 
artery,  due  to  the  simplifications  which  this  geometry  allows.   We  have  assembled 
instrumentation  within  a  clinical  echocardiography  laboratory  to  acquire  color 
Doppler  ultrasound  images  along  with  time-encoded  position/orientation  data  for  the 
handheld  transducer.   A  carotid  artery/neck  phantom  was  designed  and  fabricated  to 
allow  for  calibration  and  testing  of  both  the  position/orientation  measurement 
subsystem  and  the  Doppler  flow  velocity  measurement  subsystem. 

Flow  velocity  images  have  been  transferred  from  the  HP  SONOS  1500  ultrasound  system, 
as  separate  digital  values  of  structure  and  flow  velocity,  onto  the  Macintosh  Quadra 
950  microcomputer,  which  is  the  heart  of  our  image  reconstruction  system.   All 
algorithms  and  procedures  for  correcting  the  flow  velocity  readings  have  been 
designed  and  outlined  in  detail,  and  all  software  has  been  described  in  flowcharts. 

A  patent  application,  covering  the  basic  algorithm  for  correcting  the  color  flow 
velocity  measurements,  is  in  process.  This  project  is  otherwise  currently  inactive
at  the  NIH;  however,  work  in  this  area  is  continuing  at  the  Technion  in  Israel,  under 
the  direction  of  the  PI.   It  is  hoped  that  our  contribution  may  eventually  find  wide 
use  in  the  non-invasive  measurement  of  blood  flow  velocity,  in  research  as  well  as  in 
clinical  practice. 

DCRT  is  making  available  an  Applied  Biosystems,  Inc.  Inherit  (tm)  system  as  a  shared  resource  to  the
NIH  intramural  research  community.  This  system  employs  a  client/server  architecture  using  an  Apple
Macintosh  computer  as  the  client  platform.  Scientists  can  purchase  client  software  from  ABI  and  access
the  Inherit  (tm)  system  over  the  NIH  network. 

To  speed  results,  Inherit  (tm)  makes  use  of  highly  specialized  hardware.  The  Fast  Data  Finder  (FDF)
parallel  processor  can  perform  parallel  pattern  matching  searches  through  large  databases  at  a  rate  of 
over  15  million  characters  per  second.  This  speed  permits  completion  in  hours  of  tasks  that  often  require 
days  using  powerful  UNIX  (tm)  workstations. 

The  system  is  best  suited  to:  (1)  assembly  of  medium  to  large  sequences;  (2)  searching  gene  and  protein 
databases  for  sequence  homologies;  and  (3)  rapid  searches  for  genetic  motifs  such  as  regulatory 
elements.  An  integral  pattern  description  language  permits  construction  of  very  complex  queries.  DCRT 
has  provided  considerable  feedback  to  ABI  to  improve  the  client  user  interface,  and  has  explored  the
possibility  of  porting  client  software  to  additional  platforms,  such  as  UNIX  (tm)  workstations  or  the 
NIH  CONVEX/SGI  server. 

This  project  highlights  the  potential  of  the  NIH  network  to  bring  powerful  and  sophisticated  resources 
through  desktop  computers  to  the  scientist's  benchtop. 

Inherit  is  a  trademark  of  Applied  Biosystems,  Inc.
UNIX  is  a  trademark  of  X/Open.

In  an  ongoing  collaboration  with  Dr.  L.  Staudt,  NCI,  an  attempt  is  being  made  to  discover  novel  human
lymphoid-specific  genes  by  automated  DNA  sequencing  of  subtracted  cDNA  libraries.  Software  tools 
developed  by  DCRT  are  used  to  process  and  place  the  data  into  a  SYBASE  relational  database  system. 
These  include  tools  for  prescreening  cDNA  sequence  against  a  local  database,  automated  searching 
against  the  nonredundant  databases  on  the  NCBI  network  BLAST  server,  providing  display  of  the 
results,  and  allowing  user  interaction  to  select  information  to  be  placed  into  the  SYBASE  database. 

Work  is  under  way  to  provide  software  to  perform  complex  motif  pattern  matching  analyses,  such  as 
searches  for  nuclear  localization  signals,  on  the  cDNA  sequences.  This  software,  based  on  Genobase 
and  its  associated  toolkit,  will  permit  automated  incorporation  of  results  into  the  SYBASE  database, 
with  a  graphical  user  interface  for  input  and  editing  of  search  parameters. 
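
A motif search of this general kind can be sketched in a few lines of Python. The example below translates a cDNA sequence in one reading frame and reports every position matching a short basic-residue pattern. The pattern `K[KR].[KR]` is a commonly cited consensus for one class of nuclear localization signal and is used here purely as an assumption; it is not taken from Genobase or from the project's software, and the function names are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Lookahead so that overlapping matches are all reported.
NLS_LIKE = re.compile(r"(?=(K[KR].[KR]))")


def translate(dna, frame=0):
    """Translate one forward reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def find_motif(dna, frame=0):
    """Return (protein position, matched residues) for each pattern hit."""
    protein = translate(dna, frame)
    return [(m.start(), m.group(1)) for m in NLS_LIKE.finditer(protein)]


# cDNA encoding the peptide PKKKRKV, used only as a small test input.
hits = find_motif("CCTAAGAAGAAGCGTAAGGTT")
```

A production tool would also scan the other reading frames and the reverse strand, and would store each hit alongside the sequence identifier so that results could be loaded into a database table.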

To  date,  thousands  of  cDNA  sequences  have  been  analyzed,  yielding  homologies  to  a  variety  of 
proteins,  including  transcriptional  regulators,  signal  transduction  proteins  and  membrane  receptors. 
Work  is  in  progress  to  expand  the  scope  of  the  database  to  include  laboratory  management  information 
and  data  from  other  sources,  such  as  northern  blots. 

In  collaboration  with  Dr.  M.  Miller,  NCI,  a  critical,  quantitative  analysis  was  done  of  several
commercial  sequence  assembly  and  analysis  packages.  A  fundamental  problem  in  contemporary 
molecular  biology  is  the  determination  and  interpretation  of  DNA  sequences.  Due  to  limitations  of 
current  sequencing  technology,  sequence  determination  entails  the  piecing  together  of  short,  overlapping 
sequence  fragments  into  a  single,  long  contiguous  sequence.  A  number  of  commercial  computer 
programs  have  been  marketed  to  automate  this  process.  While  reviews  of  individual  packages  have  been 
published,  this  is  the  first  known  study  that  critically  compares  the  accuracy  of  assembly  by  these 
programs. 

Eleven  programs  were  selected,  primarily  on  the  basis  of  their  availability  on  the  NIH  campus. 
Sequence  data  is  not  random,  but  contains  ordered  repeated  sequences.  Likewise,  errors  in  sequencing 
determinations  are  not  randomly  distributed.  In  order  to  provide  a  controlled  and  realistic  dataset  for 
measuring  performance  and  accuracy,  a  known  sequence,  the  rat  multidrug  resistance  gene 
(RATMDRM,  5254  base  pairs,  accession  number  M62425)  was  split  into  58  random  overlapping 
fragments  of  200  to  400  base  pairs  in  length.  These  were  then  randomly  seeded  with  0  to  15%  error 
based  on  the  error  distribution  of  the  fragments  originally  used  to  determine  the  sequence.  Errors  were 
in  the  form  of  miscalled  bases,  deleted  bases  or  added  bases. 
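
The construction of such a test set can be sketched in Python as follows. This is a simplified illustration, not the procedure used in the study: the reference is a random stand-in rather than the RATMDRM sequence, fragment positions are drawn uniformly without checking that they cover the whole sequence, and errors are placed uniformly at random rather than following the empirical error distribution mentioned above. All function names are hypothetical.

```python
import random

BASES = "ACGT"


def make_fragments(seq, n_frags, min_len, max_len, rng):
    """Cut n_frags pieces of random length from random start positions."""
    frags = []
    for _ in range(n_frags):
        length = rng.randint(min_len, max_len)
        start = rng.randint(0, len(seq) - length)
        frags.append((start, seq[start:start + length]))
    return frags


def seed_errors(frag, rate, rng):
    """Corrupt roughly a fraction `rate` of bases by miscall, deletion, or addition."""
    out = []
    for base in frag:
        if rng.random() >= rate:
            out.append(base)              # base left unchanged
            continue
        kind = rng.choice(("miscall", "delete", "add"))
        if kind == "miscall":
            out.append(rng.choice([b for b in BASES if b != base]))
        elif kind == "add":
            out.append(base)
            out.append(rng.choice(BASES))
        # "delete": the base is simply dropped
    return "".join(out)


rng = random.Random(0)                    # fixed seed for repeatability
reference = "".join(rng.choice(BASES) for _ in range(5254))  # stand-in sequence
fragments = make_fragments(reference, 58, 200, 400, rng)
noisy = [seed_errors(frag, rng.uniform(0.0, 0.15), rng) for _, frag in fragments]
```

Keeping the start position of every fragment makes it possible to score an assembly program's output against the known layout as well as against the known sequence.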

The  programs  tested  fell  into  three  general  groups  based  on  accuracy.  In  order  to  rule  out  conditions 
unique  to  the  chosen  test  sequence,  four  other  sequences  of  between  4500  and  4600  base  pairs  were 
used  to  repeat  the  tests.  With  one  exception,  the  error  rates  were  comparable  to  those  encountered  using 
RATMDRM.  Additionally,  some  programs  were  tested  with  different  permutations  of  RATMDRM  to 
ascertain  their  capacity  to  properly  assemble  the  sequence  regardless  of  the  order  of  input  of  the 
fragments.  Ease  of  editing  the  assembled  sequences  was  also  compared.  Results  of  this  study  were 
accepted  for  publication  by  the  Journal  of  Biological  Computation. 